Fine-grained Provenance for Linear Algebra Operators
Abstract
Provenance is well understood for relational query operators. Increasingly, however, data analytics is incorporating operations expressed through linear algebra: machine learning operations, network centrality measures, and so on. In this paper, we study provenance information for matrix data and linear algebra operations. Our core technique builds upon provenance for aggregate queries and constructs a K-semialgebra. This approach tracks provenance by annotating matrix data and propagating these annotations through linear algebra operations. We investigate applications in matrix inversion and graph analysis.
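The idea of annotating matrix entries and propagating the annotations through an operation can be illustrated with a minimal Python sketch. It is not the K-semialgebra construction from the paper: it assumes a much simpler lineage-style annotation, where each entry carries the set of input cells that contributed to it. The names `annotate` and `matmul` and the token format `A[i,j]` are hypothetical and chosen only for illustration.

```python
from typing import FrozenSet, List, Tuple

# Assumption: a cell is (numeric value, set of source tokens that contributed).
Cell = Tuple[float, FrozenSet[str]]
Matrix = List[List[Cell]]


def annotate(values: List[List[float]], name: str) -> Matrix:
    """Tag every entry with a unique token such as 'A[0,1]'."""
    return [
        [(v, frozenset({f"{name}[{i},{j}]"})) for j, v in enumerate(row)]
        for i, row in enumerate(values)
    ]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two annotated matrices, unioning the lineage of nonzero terms."""
    inner, cols = len(b), len(b[0])
    out: Matrix = []
    for i in range(len(a)):
        row: List[Cell] = []
        for j in range(cols):
            total = 0.0
            lineage: FrozenSet[str] = frozenset()
            for k in range(inner):
                (av, ap), (bv, bp) = a[i][k], b[k][j]
                # A term with a zero factor contributes nothing, so it is
                # left out of the lineage (a design choice of this sketch).
                if av != 0 and bv != 0:
                    total += av * bv
                    lineage |= ap | bp
            row.append((total, lineage))
        out.append(row)
    return out


A = annotate([[1.0, 0.0], [2.0, 3.0]], "A")
B = annotate([[4.0, 5.0], [0.0, 6.0]], "B")
C = matmul(A, B)

for row in C:
    for value, lineage in row:
        print(value, sorted(lineage))
```

Each output entry records which input cells it depends on, so a question such as "which inputs influenced C[1][1]?" can be answered by reading its annotation. A richer annotation domain would be needed to also recover how the inputs were combined.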
Similar resources
Efficient Stream Provenance via Operator Instrumentation
Managing fine-grained provenance is a critical requirement for data stream management systems (DSMS), not only to address complex applications that require diagnostic capabilities and assurance, but also for providing advanced functionality such as revision processing or query debugging. This paper introduces a novel approach that uses operator instrumentation, i.e., modifying the behavior of o...
Inferring Fine-Grained Data Provenance in Stream Data Processing: Reduced Storage Cost, High Accuracy
Fine-grained data provenance ensures reproducibility of results in decision making, process control and e-science applications. However, maintaining this provenance is challenging in stream data processing because of its massive storage consumption, especially with large overlapping sliding windows. In this paper, we propose an approach to infer fine-grained data provenance by using a temporal ...
Panda: A System for Provenance and Data
Panda (for Provenance and Data) is a new project whose goal is to develop a general-purpose system that unifies concepts from existing provenance systems and overcomes some limitations in them. Panda is designed for “data-oriented workflows,” fully integrating data-based and process-based provenance. Panda’s provenance model will support a full range from fine-grained to coarse-grained provenan...
The Case for Fine-Grained Stream Provenance
The current state of the art for provenance in data stream management systems (DSMS) is to provide provenance at a high level of abstraction (such as, from which sensors in a sensor network an aggregated value is derived from). This limitation was imposed by high-throughput requirements and an anticipated lack of application demand for more detailed provenance information. In this work, we firs...
Provenance for SQL through Abstract Interpretation: Value-less, but Worthwhile
We demonstrate the derivation of fine-grained where- and why-provenance for a rich dialect of SQL that includes recursion, (correlated) subqueries, windows, grouping/aggregation, and the RDBMS’s library of built-in functions. The approach relies on ideas that originate in the programming language community, program slicing and abstract interpretation in particular. A two-stage process first recor...